Anomalous cases of astronaut helmet detection 
ABSTRACT 

An astronaut’s helmet is an invariant, rigid image element that is well suited for identification and tracking using current 
machine vision technology. Future space exploration will benefit from the development of astronaut detection software 
for search and rescue missions based on EVA helmet identification. However, helmets are solid white, except for metal 
brackets to attach accessories such as supplementary lights. We compared the performance of a widely used machine 
vision pipeline on a standard-issue NASA helmet with and without affixed experimental feature-rich patterns. 
Performance on the patterned helmet was far more robust. We found that four different feature-rich patterns are 
sufficient to identify a helmet and determine its orientation as it is rotated about the yaw, pitch, and roll axes. During helmet 
rotation, the field of view changes to frames containing parts of two or more feature-rich patterns. We took reference 
images in these locations to fill in detection gaps. These multiple feature-rich pattern references added substantial 
benefit to detection; however, they generated the majority of the anomalous cases. In these few instances, our algorithm 
keys in on one feature-rich pattern of the multiple feature-rich pattern reference and makes an incorrect prediction of the 
location of the other feature-rich patterns. We describe anomalous cases in which detection of one or more feature-rich 
patterns fails and make recommendations on ways to mitigate them. While the number of cases is only a small percentage of the 
tested helmet orientations, they illustrate important design considerations for future spacesuits. In addition to our four 
successful feature-rich patterns, we present unsuccessful patterns and discuss the cause of their poor performance from a 
machine vision perspective. Future helmets designed with these considerations will enable automated astronaut detection 
and thereby enhance mission operations and extraterrestrial search and rescue. 

1. INTRODUCTION 

Space suits are engineered to protect astronauts from harsh extraterrestrial conditions during space missions. Astronaut 
helmets are pressurized to maintain a life-supporting environment while providing impact protection and communication 
systems [1]. Thus, a rigid helmet remains common to both current and proposed spacesuit designs (e.g., the BioSuit [2]). 
The enduring rigid helmet design is of interest for machine vision because visual identifiers may be readily attached. An 
astronaut helmet is an excellent platform for affixing visual patterns due to its rigidity and geometry. The surface of the 
helmet is wrinkle-free, is without sharp corners, and changes smoothly in depth, width, and height. An attached 
pattern is viewable through a modest range of pitch, yaw, and roll rotations with gradual distortion. 

Future space exploration will benefit from the development of astronaut detection capability for managing EVA 
operations, for rover following [3] and for search and rescue missions. 

For rigid objects and fixed scenes, current machine vision technology is capable of identifying imagery rapidly and with 
specificity over a modest range of camera viewpoints and scene illumination. The machine vision pipeline that we used 
employs a perspective transformation^1 that maps the outline of an object from one camera viewpoint (the reference) to 
the outline of the object in a moderately different camera viewpoint (the test). In this application we attempt to apply the 
pipeline to a changing scene (an astronaut helmet with arbitrary 3D rotation) with the camera viewpoint fixed. As a result, 
some anomalies in perspective transformation occurred, and we document them in this paper. 

In this study, we located a spacesuit helmet in a video stream as the helmet orientation varied through a wide range of 
pitch, yaw, and roll angles by recognizing several affixed feature-rich patterns (FRPs). Four FRPs were affixed to the 
back, top, left, and right regions of the helmet. We previously demonstrated^1 that location of a helmet via machine vision is 
much more reliable if the helmet is augmented with four FRPs. In this paper, we discuss two topics: 1) FRPs that were 
not successful, and why, from a machine vision perspective; and 2) anomalous cases in which a standard machine vision 
algorithm yielded false identifications. 


2. METHODS AND MATERIALS 

We mounted a standard-issue EVA helmet on a tripod for detection using a Logitech C920 webcam. The tripod allowed 
for a wide range of rotations about the pitch, yaw, and roll axes. The range of observations included 360 degrees of yaw, 
90 degrees of roll, and 80 degrees of pitch with measurements taken at 10-degree increments. The tripod only allowed 
for two simultaneous axis manipulations; thus, the experiment required two sub-experiments of roll-yaw and pitch-yaw. 
The camera and tripod positions were fixed, but the helmet was nearer to the camera for some pitch-yaw-roll positions 
than others. The camera zoom was fixed so that the helmet nearly filled the video frame (vertical extent, 480 pixels) for 
the helmet position closest to the camera. For helmet positions furthest from the camera, the helmet filled about 30% of 
the video frame. 

2.1 Feature-rich patterns 

We tested several candidate images with an aerospace theme (Figure 1) for performance through axial rotations of the 
tripod-mounted helmet. To create reference images of these patterns for image matching, images of the tripod-mounted 
helmet were captured at selected rotations with a resolution of 640 by 480 pixels, and a rectangular bounding box was 
drawn to define the area of interest on the helmet without extending into the background of the scene. This method 
limited the resolution of the reference imagery, which ranged from 64 x 86 pixels (an image of the comet) to 256 x 241 
pixels (an image containing the comet, the rover, and the lightning bolt). 

Figure 1. Feature-rich patterns used in astronaut detection: comet (a), astronaut (b), lightning bolt (c through e), filled-in 
astronaut (f), comet (g), and rover (h). We found the filled-in patterns (d through h) to be more robust in astronaut detection 
than the outlines in (a) through (c). 

2.2 Machine Vision 

FRPs were detected using OpenCV version 2.9 [6] on a Linux machine with 16 cores (32 virtual cores with 
hyper-threading) and 4 NVIDIA GPUs. Our pipeline consisted of five steps: 1) convert the video frame to grayscale; 2) extract 
features using the SURF feature detector [7]; 3) match features between the reference image and the frame; 4) determine whether the matches 
constitute a positive identification of a reference pattern; and 5) draw a box around the identified reference pattern in the video frame. 
Detection was rapid (10 frames per second or faster) even with a video stream busy with “keypoints,” which are points 
of significant geometric concavity or convexity [8]. 

The SURF feature detector finds keypoints in the current frame and creates a geometric descriptor vector for each. These 
descriptors are compared with the descriptor vectors of the reference images. Matching of descriptors is accomplished using the L2 
norm. SURF rejects matches that are insufficiently unique to avoid incorrect matching. If the descriptors in the frame resemble those of a 
reference image during the matching process and the matches are sufficiently unique, then a box is drawn in the frame based on the 
keypoint locations. Multiple boxes may be drawn in the same frame due to positive identification of multiple references. 
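
The matching and decision steps described above can be sketched as follows. This is an illustration only, not the code used in the study: it assumes that the uniqueness criterion is a nearest-neighbor distance-ratio test, which is a common choice that the text does not confirm. The function names, the ratio of 0.7, and the minimum of four matches are hypothetical, and the two-element vectors stand in for real SURF descriptors.

```python
import math


def l2(a, b):
    """Euclidean (L2) distance between two equal-length descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def unique_matches(frame_desc, ref_desc, ratio=0.7):
    """Return (frame_index, ref_index) pairs that pass a distance-ratio test.

    A frame descriptor is kept only if its nearest reference descriptor is
    clearly closer than the second nearest (assumed uniqueness criterion).
    """
    matches = []
    for i, d in enumerate(frame_desc):
        ranked = sorted((l2(d, r), j) for j, r in enumerate(ref_desc))
        if len(ranked) < 2:
            continue  # uniqueness cannot be judged with a single candidate
        (best, j), (second, _) = ranked[0], ranked[1]
        if best < ratio * second:
            matches.append((i, j))
    return matches


def is_positive_identification(matches, min_matches=4):
    """Declare a detection when enough unique matches survive (assumed rule)."""
    return len(matches) >= min_matches


# Toy data: reference descriptors 1 and 2 are near-duplicates, so the frame
# descriptor that lands between them is rejected as ambiguous.
reference = [[0.0, 0.0], [5.0, 5.0], [5.0, 5.1], [9.0, 0.0]]
frame = [[0.1, 0.0], [5.0, 5.05], [9.0, 0.2]]
print(unique_matches(frame, reference))  # [(0, 0), (2, 3)]
```

In this toy case only two matches survive, so the default threshold would not report a detection; a reference pattern with many near-duplicate features loses matches in the same way.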

2.3 Single FRP vs. Single and Multiple FRP 

In our previous study [5], we showed that with FRPs affixed to a helmet, detection was much more robust than in the 
feature-poor experiment, in which the reference images used for matching were pictures taken of an unaugmented helmet (no 
affixed FRPs). Using four FRPs, we achieved positive identification of the augmented helmet using 14 reference images 
at approximately two times more rotational positions than we achieved using 117 reference images of the unaugmented 
helmet. 

The rotation of the helmet yielded several instances in which the viewable frame consists of two or more spatially 
distorted FRPs as shown in Figure 2. The comet, rover, and lightning bolt are simultaneously visible in the frame. Single 
FRP reference images captured at these rotations are detectable only over a small range of helmet positions, and 
reference images of an FRP taken at other angles do not perform well at these rotations due to out-of-plane rotation and 
scaling effects. 



Figure 2. A rotation angle in which multiple FRPs are visible on the augmented helmet. 

To mitigate this, four multiple FRP reference images were added to the reference image library of ten single FRP images 
to increase continuity of detection across rotation in pitch, yaw, and roll. The multiple FRP references not only filled gaps in 
detection, but also added redundancy in positions previously identified by a single FRP. Figure 3 and Figure 4 
are positive identification plots for roll vs. yaw and pitch vs. yaw. Because the tripod allowed only two simultaneous 
axes of rotation, two experiments were necessary to 
capture rotation in three dimensions. Figure 3a and Figure 4a show identification results with the multiple FRP references 
excluded, while Figure 3b and Figure 4b include them. The Not Possible area (darkest gray) 
indicates rotations for which the rear of the helmet is not in view and the reflective visor occupies the frame. The Difficult 
area (light gray) indicates rotations of the helmet at which all reference images are severely distorted. Finally, the Possible area 
(yellow) indicates rotations for which we expect to see positive identification. When multiple positive identifications 
occur for a given pitch, yaw, and roll location, multiple colors are used to fill the cell. A key indicates which reference 
image was detected at a given position. For rotation positions with more than one detected reference image (we observed 
a maximum of four), up to four colors are included in the corresponding cell of Figure 3 and Figure 4. 



In the single FRP experiment (Figure 3a) the helmet was detected at most rotations, but for some roll-yaw rotations there 
are notable gaps (e.g., at 120 degrees of yaw, white circle). The addition of the multiple FRP reference images filled 
these gaps (Figure 3b). 

The addition of multiple FRP reference images to the procedure filled in two dropout locations in the pitch-yaw 
detection plot (Figure 4a): yaw 220 and pitch 90, and yaw 220 and pitch 160. As in the roll vs. yaw experiment in Figure 
3, the multiple FRP procedure provided continuous positive identification while adding redundancy in detection. 
The increased detection stability afforded by the inclusion of multiple FRP imagery came at a cost, however: all of the 
anomalous cases described in Section 3.2 arise from the added multiple FRP imagery. 

Figure 3. Roll vs. yaw experiment with single FRP reference images (a) and with both single and multiple FRP reference 
images (b). Pitch is held constant at 90 degrees. Dark gray color indicates locations at which no detection is expected, light gray 
indicates locations at which only a distorted part of a reference may appear in the video frame, and yellow locations are 
expected to produce positive identification. The two white circles in (a) indicate locations at which multiple FRPs are 
positively identified in (b). 


3. RESULTS AND DISCUSSION 

In this section we discuss the characteristics that successful FRPs have in common and also discuss anomalous cases of FRP 
identification. 


3.1 Unsuccessful versus successful FRPs 


While testing FRPs we observed characteristics of successful patterns. Solid FRPs (Figure 1d-h) performed better as the 
helmet was rotated about the pitch, yaw, and roll axes than FRPs drawn as outlines (Figure 1a-c). If the camera's line of sight is 
orthogonal to a pattern, whether outline or solid, the inflection points along the edges of outline and solid FRPs are similar 
(although the filled-in FRPs have a greater number of keypoints at multiple scales). However, as the helmet is rotated so 
that an FRP is no longer orthogonal to the camera's line of sight, the outline FRP lines become thin and intermittent in the video 
frame. While the solid FRPs may show edge warping and distortion during out-of-plane rotations, the filled-in space 
creates keypoints that exist on multiple scales, thus making filled-in FRPs more stable across more viewing angles. 

Figure 4. Pitch vs. yaw experiment with single FRP reference images (a) and with both single and multiple FRP reference 
images (b). Roll is held constant at 90 degrees. Dark gray color indicates locations at which no detection is expected, light gray 
indicates locations at which only a distorted part of a reference may appear in the video frame, and yellow locations are 
expected to produce positive identification. The two white circles in (a) indicate locations at which multiple FRPs are 
positively identified in (b). 

Despite being filled in, the lightning bolt in Figure 1d was not robust. We explored the reason for this by observing 
matching at the keypoint level in Figure 5 (left). The matches between the drawn lightning bolt and the actual reference 
are shown with similarity rejection on and off. The matches are rejected due to SURF’s similarity rejection criterion, 
which ensures that only distinct inflection points are used for matches to avoid incorrect matching. In particular, for the 
lightning bolt in Figure 5 (left), the two 60° corners on its left side (solid arrows) are similar and close together. We found 
that the corners indicated by the solid arrows are rejected by SURF. To mitigate inflection point rejection, we truncated 
the upper corner in the modified design (Figure 5 right), which yielded matches that survived the similarity rejection 
criteria of SURF. Matching is computed by SURF for all rotations and translations, so as to achieve translation-invariant 
and rotation-invariant object detection. Subsequent to matching, matched features that are very similar are removed from 
consideration, so that only unique, distinguishable features are used to match the object as a whole. 
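
A candidate pattern can be screened for this problem before it is affixed to a helmet. The sketch below is a hypothetical design-time check, not part of the pipeline described here: it flags descriptors within a single reference pattern that lie closer together than a chosen separation. The function name, the separation threshold of 0.5, and the three-element descriptors are assumptions made for illustration.

```python
import math


def l2(a, b):
    """Euclidean (L2) distance between two equal-length descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def ambiguous_features(descriptors, min_separation=0.5):
    """Return sorted indices of descriptors that have a near-duplicate
    elsewhere in the same pattern (candidates for similarity rejection)."""
    flagged = set()
    for i in range(len(descriptors)):
        for j in range(i + 1, len(descriptors)):
            if l2(descriptors[i], descriptors[j]) < min_separation:
                flagged.update((i, j))
    return sorted(flagged)


# Hypothetical descriptors: indices 0 and 1 mimic two look-alike corners.
original = [[1.0, 0.0, 0.2], [1.0, 0.1, 0.2], [0.0, 1.0, 0.9]]
# In the modified pattern, the second corner has been altered.
modified = [[1.0, 0.0, 0.2], [0.3, 0.8, 0.1], [0.0, 1.0, 0.9]]
print(ambiguous_features(original))  # [0, 1]
print(ambiguous_features(modified))  # []
```

A pattern that produces a long list of flagged features would be a candidate for modification of the kind applied to the lightning bolt.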

The most successful FRPs are solid (as opposed to outlines) and have a high number of spatially distinct concave/convex 
vertices per area. FRPs drawn with thin lines fail to provide robust keypoints through rotation across a wide set of poses. 
In Figure 1, the astronaut and the rover have far more distinct vertices at greater density than the comet and lightning 
bolt. These additional vertices provide keypoints that survive SURF’s elimination criteria. 

Additionally, an aspect ratio of approximately 1:1 seems to be more robust and resilient to out-of-plane distortions. There 
were a number of instances with moderate to severe visual distortion in which the rover and astronaut were detected 
but the lightning bolt and comet were not. Furthermore, the lightning bolt and comet were not always 
detected even if the distortion was minor. 



Figure 5. Feature similarity rejection as a design constraint. To improve detectability, similar features of the original 
lightning bolt design (left images) were perturbed to give a more robust pattern (right images). Clockwise from left for both 
left and right image sets: drawn image, unmatched (green circles) and matched (blue circles and lines) features with 
similarity rejection disabled, unmatched and matched features with similarity rejection enabled, and features each represented as a 
circle with a radius corresponding to its spatial scale. The arrows indicate example areas of modification to avoid similar 
spatial structure, which results in similarity rejection. 

3.2 Anomalous Cases 

In the non-FRP experiment, for a fixed helmet position and camera viewpoint, match boxes were stable when observed. 
In the FRP experiment, for single FRP references, match boxes were also stable for a fixed helmet position and camera 
viewpoint. For multiple FRP references, however, the match box would sometimes flicker with a period of 
approximately 20 frames and a duty cycle of approximately 50%. 
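
The period and duty cycle of such flicker can be estimated from a per-frame record of whether the match box was drawn. The sketch below is one simple way to do so and is offered as an illustration only; the function name and the synthetic sequence are assumptions, not data from the experiment.

```python
def flicker_stats(detected):
    """Estimate flicker period (in frames) and duty cycle from per-frame flags.

    The period is the mean spacing between rising edges (off -> on
    transitions); the duty cycle is the fraction of frames in which the box
    was drawn. Returns (None, duty) if fewer than two rising edges exist.
    """
    rising = [i for i in range(1, len(detected))
              if detected[i] and not detected[i - 1]]
    duty = sum(detected) / len(detected) if detected else 0.0
    if len(rising) < 2:
        return None, duty
    gaps = [b - a for a, b in zip(rising, rising[1:])]
    return sum(gaps) / len(gaps), duty


# Synthetic sequence: box drawn for 10 frames, absent for 10, repeated 5 times.
flags = ([True] * 10 + [False] * 10) * 5
period, duty = flicker_stats(flags)
print(period, duty)  # 20.0 0.5
```

A stable match box yields no rising edges after the first frame, so the period is reported as None and the duty cycle is 1.0.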

Beyond this one type of dynamic instability, we observed four classes of static anomalies, which are discussed in turn 
below: 1) accurate projection of a multiple FRP not entirely in view; 2) out of scale beyond the frame; 3) out of plane; 
and 4) incorrect detection box orientation arising from a feature mismatch. 

Accurate projection of FRP not entirely in view 

The machine vision pipeline used in this study is designed to project a bounding box of an object in a reference camera 
viewpoint to a new bounding box for the object as the camera viewpoint changes [4] via epipolar geometric perspective 
transformation. In this study, the ‘object’ is an area on a helmet, and it may rotate so that portions of the object may 
move entirely out of the camera view; we knew that the incremental angular variation that occurs with a moderate 
perspective change is insufficient to capture the large angular changes that occur in viewing the surface of a rotating 
sphere. For example, if a portion of a multiple FRP is not in the camera frame due to the helmet orientation, this can lead 
to an anomaly in which part of the match box is overlaid on the helmet but part is overlaid on the background scene. 
Since we did not have a bounding box transformation module available for the severe geometric variation that we 
encountered, we documented anomalies that arose from the oversimplified model of bounding box transformation. 
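
The bounding box projection at issue can be sketched as the application of a 3 x 3 homography to the four corners of the reference image. The code below is an illustration under stated assumptions and is not the module used in the study: the matrix H is hypothetical (in practice it would be estimated from the matched keypoints, for example with OpenCV's cv2.findHomography), and the function names are our own. The 64 x 86 reference size is that of the comet image described in Section 2.1.

```python
def project_point(H, x, y):
    """Apply a 3x3 homography H (row-major nested lists) to the point (x, y)."""
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this homography")
    return (u / w, v / w)


def project_box(H, width, height):
    """Map the corners of a width x height reference image into the frame.

    Corners are ordered top-left, top-right, bottom-right, bottom-left.
    """
    corners = [(0, 0), (width, 0), (width, height), (0, height)]
    return [project_point(H, x, y) for x, y in corners]


# Hypothetical homography: scale the reference by 2 and shift it by (100, 50).
H = [[2.0, 0.0, 100.0],
     [0.0, 2.0, 50.0],
     [0.0, 0.0, 1.0]]
print(project_box(H, 64, 86))
# [(100.0, 50.0), (228.0, 50.0), (228.0, 222.0), (100.0, 222.0)]
```

Because a single planar mapping is applied to the whole reference rectangle, any part of the reference that has rotated out of view on the curved helmet is still projected into the frame, which is the source of the anomaly described above.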

In Figure 6, half of the multiple FRP is correctly identified and the second half of the identification box is projected in 
the correct direction with a reasonable scale and in the expected plane. In Figure 6a the lightning bolt in the video frame 
(left) is identified using the reference image (right). The positive identification box is drawn in the correct orientation 
and scale despite the comet not being visible in the frame. Similarly in Figure 6b, the comet is locked on and the 
identification box is projected in the direction of the lightning bolt. 

Figure 6. Identification and correct projection using only half of the multiple FRP of the comet and lightning bolt. In 6a, 
only the lightning bolt is in the camera’s field of view for the matching box indicated by the block arrows. The machine 
vision pipeline accurately projects the direction of the comet. In 6b, only the comet is in the camera’s field of view. 

Out of scale beyond the frame 

In Figure 7 the multiple FRP of the comet and lightning bolt is identified at the correct scale (thick match box), yet the 
multiple FRP (right) indicated by the white block arrow (left) is drawn with a large box extending beyond the frame. 
This arises for two reasons: 1) the epipolar projection model is not well suited to this geometry and 2) several features of 
the reference are not matched. 
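
One possible safeguard against this class of anomaly, which we did not implement, is to test each projected match box against the frame before it is drawn. The sketch below is a hypothetical plausibility check: the function names and the area budget are assumptions, and the default frame size of 640 x 480 pixels is taken from Section 2.1. A box that legitimately extends past the frame edge would also be flagged, so the result is better treated as a warning than as a rejection.

```python
def polygon_area(corners):
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    n = len(corners)
    for i in range(n):
        x1, y1 = corners[i]
        x2, y2 = corners[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def box_is_plausible(corners, frame_w=640, frame_h=480, max_area_ratio=1.0):
    """True if every corner lies inside the frame and the box area does not
    exceed max_area_ratio times the frame area (assumed rule)."""
    inside = all(0 <= x <= frame_w and 0 <= y <= frame_h for x, y in corners)
    small_enough = polygon_area(corners) <= max_area_ratio * frame_w * frame_h
    return inside and small_enough


# Hypothetical boxes: one well inside the frame, one extending far beyond it.
good_box = [(100, 50), (228, 50), (228, 222), (100, 222)]
oversize_box = [(-300, -200), (900, -150), (950, 700), (-250, 650)]
print(box_is_plausible(good_box))      # True
print(box_is_plausible(oversize_box))  # False
```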



Figure 7. Out of viewport projection. At left we see positive identification of the reference image (right). However, it is 
greatly out of scale. We believe that the machine vision pipeline is matching the comet FRP and missing the lightning bolt 
FRP, but the match status of the astronaut FRP and mounting bolts is not clear. 

Out of plane 

In Figure 8, a multiple FRP is identified but the plane of projection is not intuitive. This is another example of how an 
epipolar perspective transformation can fail for a reference image wrapped on a roughly spherical surface. A manual 
rotation of the reference at top right is shown at bottom right; with minor warping this conforms to the match box in the 
video frame at left. 



Figure 8. Limitations of perspective transformation (3D case). The matched reference is at top right. Manual rotation of the 
reference counterclockwise in the plane and with the top edge into the plane is shown at bottom right. 

Incorrect detection box orientation arising from feature mismatch 

In Figure 9a, it is apparent that the lightning bolt was identified in the incorrect orientation, and thus the homography 
projection is in the wrong direction at roughly the correct scale. In Figure 9b the rover is identified, but the homography 
is drawn at the incorrect scale and in the wrong orientation. A close examination of the matches indicated by the lines 
between the frame and the reference image reveals that keypoints on the lightning bolt (a) and rover (b) are mismatched. 
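
Some mismatched homographies can be recognized from the shape of the projected box alone. The sketch below is a hypothetical check that we did not use in this study: it accepts a quadrilateral only if it is convex and keeps the winding order of the reference corners. It catches mirror-flipped and self-intersecting boxes but not a box that is merely rotated in the image plane, so it addresses only part of this class of anomaly. The function names and corner ordering are assumptions.

```python
def signed_turns(corners):
    """Z-component of the cross product of successive edges at each vertex."""
    n = len(corners)
    turns = []
    for i in range(n):
        ax, ay = corners[i]
        bx, by = corners[(i + 1) % n]
        cx, cy = corners[(i + 2) % n]
        turns.append((bx - ax) * (cy - by) - (by - ay) * (cx - bx))
    return turns


def box_orientation_ok(corners):
    """True if the quadrilateral is convex and keeps the reference winding.

    Corners are assumed to be ordered top-left, top-right, bottom-right,
    bottom-left in image coordinates (y pointing down), which gives a
    positive cross product at every vertex of an undistorted box.
    """
    return all(t > 0 for t in signed_turns(corners))


# Hypothetical projected boxes.
upright = [(100, 50), (228, 50), (228, 222), (100, 222)]
mirrored = [(228, 50), (100, 50), (100, 222), (228, 222)]
bowtie = [(100, 50), (228, 222), (228, 50), (100, 222)]
print(box_orientation_ok(upright))   # True
print(box_orientation_ok(mirrored))  # False
print(box_orientation_ok(bowtie))    # False
```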

This is the most serious class of anomaly. To avoid incorrect matches of image features, we envision two improvements 
to the current method. First, patterns should be optimized for detection, as discussed in section 3.1. Second, both 
reference and test image resolution should be increased by using an imager with 1080p (or better) capability or by 
adopting a pan-tilt-zoom image capture architecture so that the helmet fills as much of the camera frame as possible. An 
example of a test of this method at 1080p resolution [9] shows few anomalies of this type. 

Figure 9. Incorrect detection box orientation arising from feature mismatch. 

4. CONCLUSIONS 

In this paper we presented our observations and analysis of the characteristics of successful FRPs when compared to 
unsuccessful FRPs. Future design of space safety systems using FRPs may benefit from our trials for the purpose of 
astronaut detection. FRPs must be carefully scrutinized for similarity rejection criteria. Successful FRPs possess a high 
number of spatially distinct concave or convex vertices per area to survive SURF’s similarity rejection criteria. As the 
details of similarity rejection are difficult to discern with the naked eye, we recommend testing several FRPs with SURF 
to illuminate the robustness of a given FRP. 



We presented anomalous cases and made recommendations on mitigation. While the inclusion of multiple FRP references generated 
important positive identifications and added beneficial redundancy, these references are responsible for the anomalous cases. In order 
to reduce the number of anomalies, we recommend the following: a) a bounding box projection method that better models 
the geometry is needed to remove one major source of anomalies; b) a sensor with 1080p (or better) resolution or a 
pan-tilt-zoom (PTZ) image capture method should be employed to minimize incorrect feature matching when the helmet 
occupies a small portion of the camera frame. PTZ cameras may be programmed to zoom in on movement; a machine 
vision method such as the one described in this study could check for astronauts and subsequently follow their movements 
during a spacewalk. 